Overview

Dataset Statistics

Number of Variables 24
Number of Rows 20594
Missing Cells 2
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 15.6 MB
Average Row Size in Memory 792.7 B
Variable Types
  • Categorical: 9
  • Numerical: 15
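
The overview figures above can be recomputed for any tabular dataset. The following is a minimal sketch, assuming pandas and a small made-up DataFrame. The helper name overview and the toy values are illustrative only and do not come from the report or from the tool that generated it, which may count things differently (for example with approximate distinct counts).

```python
import pandas as pd


def overview(df: pd.DataFrame) -> dict:
    """Summarize a non-empty DataFrame in the style of the Dataset Statistics block."""
    n_rows, n_cols = df.shape
    missing = int(df.isna().sum().sum())            # missing cells over all columns
    duplicates = int(df.duplicated().sum())         # rows identical to an earlier row
    memory = int(df.memory_usage(deep=True).sum())  # bytes, including the index
    numerical = df.select_dtypes(include="number").shape[1]
    return {
        "variables": n_cols,
        "rows": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": 100 * missing / (n_rows * n_cols),
        "duplicate_rows": duplicates,
        "memory_bytes": memory,
        "avg_row_bytes": memory / n_rows,
        "numerical": numerical,
        "categorical": n_cols - numerical,  # everything non-numeric is treated as categorical
    }


# Made-up toy data, used only to exercise the helper.
toy = pd.DataFrame({"Artist": ["A", "A", "B"], "Energy": [0.5, 0.5, None]})
stats = overview(toy)
```

As a consistency check on the report itself, 792.7 B per row times 20594 rows is about 16.3 million bytes, which is roughly 15.6 MB if a megabyte is taken as 1024 * 1024 bytes.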

Dataset Insights

Loudness is skewed
Speechiness is skewed
Acousticness is skewed
Instrumentalness is skewed
Liveness is skewed
Duration_min is skewed
Views is skewed
Likes is skewed
Comments is skewed
Stream is skewed
EnergyLiveness is skewed
Artist has a high cardinality: 2074 distinct values
Track has a high cardinality: 17717 distinct values
Album has a high cardinality: 11854 distinct values
Title has a high cardinality: 18023 distinct values
Channel has a high cardinality: 6673 distinct values
most_playedon has constant length 7
Loudness has 20586 (99.96%) negatives
Instrumentalness has 9319 (45.25%) zeros
Comments has 1065 (5.17%) zeros
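
The report does not state the rule it uses to flag a variable as skewed. As a rough illustration only, the sketch below applies a hypothetical cutoff of 0.8 on the absolute skewness (γ1) values listed in the per-variable sections. That cutoff is an assumption, chosen because it happens to separate the eleven flagged variables from the four unflagged numerical ones. The tool's actual criterion may be different.

```python
# Skewness (γ1) values copied from the per-variable sections of this report.
skewness = {
    "Danceability": -0.5571,
    "Energy": -0.7175,
    "Loudness": -2.7,
    "Speechiness": 3.3659,
    "Acousticness": 0.8839,
    "Instrumentalness": 3.7125,
    "Liveness": 2.3077,
    "Valence": -0.1029,
    "Tempo": 0.3889,
    "Duration_min": 23.3346,
    "Views": 9.3073,
    "Likes": 8.7463,
    "Comments": 44.1567,
    "Stream": 4.1473,
    "EnergyLiveness": 2.692,
}

# Hypothetical cutoff (an assumption, not the tool's documented rule).
CUTOFF = 0.8

flagged = [name for name, g1 in skewness.items() if abs(g1) > CUTOFF]
not_flagged = [name for name in skewness if name not in flagged]

# Negative γ1 means a longer left tail, positive γ1 a longer right tail.
direction = {name: ("right" if skewness[name] > 0 else "left") for name in flagged}
```

With these inputs, Loudness is the only flagged variable with a left tail. The other ten flagged variables are skewed right.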

Variables


Artist

categorical

Approximate Distinct Count 2074
Approximate Unique (%) 10.1%
Missing 0
Missing (%) 0.0%
Memory Size 1615461

Length

Mean 10.9881
Standard Deviation 4.846
Median 11
Minimum 2
Maximum 45

Sample

1st row Gorillaz
2nd row Gorillaz
3rd row Gorillaz
4th row Gorillaz
5th row Gorillaz

Letter

Count 203516
Lowercase Letter 160245
Space Separator 18410
Uppercase Letter 43271
Dash Punctuation 310
Decimal Number 919
  • The largest value (the) is over 2.69 times larger than the second largest value (los)

Track

categorical

Approximate Distinct Count 17717
Approximate Unique (%) 86.0%
Missing 0
Missing (%) 0.0%
Memory Size 1839673

Length

Mean 19.5107
Standard Deviation 14.2397
Median 15
Minimum 1
Maximum 195

Sample

1st row Feel Good Inc.
2nd row Rhinestone Eyes
3rd row New Gold (feat. Ta...
4th row On Melancholy Hill
5th row Clint Eastwood

Letter

Count 320111
Lowercase Letter 248914
Space Separator 55870
Uppercase Letter 71197
Dash Punctuation 3024
Decimal Number 5079
  • The largest value (feat) is over 1.56 times larger than the second largest value (the)

Album

categorical

Approximate Distinct Count 11854
Approximate Unique (%) 57.6%
Missing 0
Missing (%) 0.0%
Memory Size 1848058

Length

Mean 20.4038
Standard Deviation 14.4793
Median 16
Minimum 1
Maximum 195

Sample

1st row Demon Days
2nd row Plastic Beach
3rd row New Gold (feat. Ta...
4th row Plastic Beach
5th row Gorillaz

Letter

Count 344023
Lowercase Letter 267714
Space Separator 51203
Uppercase Letter 76309
Dash Punctuation 917
Decimal Number 5398
  • The largest value (the) is over 2.68 times larger than the second largest value (edition)

Album_type

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1451275
  • The largest value (album) is over 2.98 times larger than the second largest value (single)

Length

Mean 5.4708
Standard Deviation 1.1814
Median 5
Minimum 5
Maximum 11

Sample

1st row album
2nd row album
3rd row single
4th row album
5th row album

Letter

Count 112665
Lowercase Letter 112665
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (album, single) take over 50.0%

Danceability

numerical

Approximate Distinct Count 898
Approximate Unique (%) 4.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.6201
Minimum 0
Maximum 0.975
Zeros 19
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Danceability is skewed left (γ1 = -0.5571)

Quantile Statistics

Minimum 0
5-th Percentile 0.318
Q1 0.519
Median 0.638
Q3 0.741
95-th Percentile 0.861
Maximum 0.975
Range 0.975
IQR 0.222

Descriptive Statistics

Mean 0.6201
Standard Deviation 0.1655
Variance 0.02739
Sum 12770.3767
Skewness -0.5571
Kurtosis 0.1514
Coefficient of Variation 0.2669
  • Danceability is not normally distributed (p-value 1.906845668333852e-05)
  • Danceability has 271 outliers
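
Several of the Danceability figures are derived from the others, so they can be cross-checked by hand. The sketch below uses only numbers copied from this section and the overview row count, with the standard definitions (mean = sum / rows, IQR = Q3 - Q1, variance = standard deviation squared, coefficient of variation = standard deviation / mean). It is a sanity check, not a description of how the tool computes them.

```python
# Values copied from the Danceability section and the overview.
rows = 20594
total = 12770.3767
q1, q3 = 0.519, 0.741
minimum, maximum = 0.0, 0.975
std = 0.1655

mean = total / rows            # no missing values, so every row contributes
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
variance = std ** 2
cv = std / mean                # coefficient of variation
```

Rounded to the precision shown in the report, these reproduce the listed mean, IQR, range, variance, and coefficient of variation.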

Energy

numerical

Approximate Distinct Count 1268
Approximate Unique (%) 6.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.6352
Minimum 0
Maximum 1
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Energy is skewed left (γ1 = -0.7175)

Quantile Statistics

Minimum 0
5-th Percentile 0.219
Q1 0.507
Median 0.666
Q3 0.798
95-th Percentile 0.929
Maximum 1
Range 1
IQR 0.291

Descriptive Statistics

Mean 0.6352
Standard Deviation 0.2143
Variance 0.04591
Sum 13080.816
Skewness -0.7175
Kurtosis 0.1438
Coefficient of Variation 0.3373
  • Energy is not normally distributed (p-value 6.03463512448569e-10)
  • Energy has 366 outliers

Loudness

numerical

Approximate Distinct Count 9388
Approximate Unique (%) 45.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean -7.6783
Minimum -46.251
Maximum 0.92
Zeros 2
Zeros (%) 0.0%
Negatives 20586
Negatives (%) 100.0%
  • Loudness is skewed left (γ1 = -2.7)

Quantile Statistics

Minimum -46.251
5-th Percentile -15.9253
Q1 -8.868
Median -6.5405
Q3 -4.935
95-th Percentile -3.2037
Maximum 0.92
Range 47.171
IQR 3.933

Descriptive Statistics

Mean -7.6783
Standard Deviation 4.6395
Variance 21.5248
Sum -158125.956
Skewness -2.7
Kurtosis 10.7141
Coefficient of Variation -0.6042
  • Loudness is not normally distributed (p-value 2.228507689295787e-08)
  • Loudness has 1284 outliers
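
Two Loudness figures can look odd at first: the negative coefficient of variation, and a negatives share shown as 100.0% here but 99.96% in the Dataset Insights list even though the maximum is positive. The short check below, using only numbers from this report, suggests that both follow from the standard definitions and from rounding.

```python
# Values copied from the Loudness section and the overview.
rows = 20594
negatives = 20586
zeros = 2
mean, std = -7.6783, 4.6395

positives = rows - negatives - zeros   # rows with Loudness above zero
negative_pct = 100 * negatives / rows  # share of negative values in percent
cv = std / mean                        # negative because the mean is negative

one_decimal = round(negative_pct, 1)   # precision used in the variable section
two_decimals = round(negative_pct, 2)  # precision used in the insights list
```

The six positive rows are consistent with the reported maximum of 0.92. The sign of the coefficient of variation carries no extra information here, and its magnitude is what describes the relative spread.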

Speechiness

numerical

Approximate Distinct Count 1303
Approximate Unique (%) 6.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.09673
Minimum 0
Maximum 0.964
Zeros 19
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Speechiness is skewed right (γ1 = 3.3659)

Quantile Statistics

Minimum 0
5-th Percentile 0.0278
Q1 0.0357
Median 0.05065
Q3 0.104
95-th Percentile 0.324
Maximum 0.964
Range 0.964
IQR 0.0683

Descriptive Statistics

Mean 0.09673
Standard Deviation 0.1122
Variance 0.01258
Sum 1992.1204
Skewness 3.3659
Kurtosis 16.4211
Coefficient of Variation 1.1597
  • Speechiness is not normally distributed (p-value 5.0046930851914e-17)
  • Speechiness has 2591 outliers

Acousticness

numerical

Approximate Distinct Count 3131
Approximate Unique (%) 15.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.2914
Minimum 0
Maximum 0.996
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Acousticness is skewed right (γ1 = 0.8839)

Quantile Statistics

Minimum 0
5-th Percentile 0.00161
Q1 0.0452
Median 0.193
Q3 0.4768
95-th Percentile 0.885
Maximum 0.996
Range 0.996
IQR 0.4315

Descriptive Statistics

Mean 0.2914
Standard Deviation 0.2861
Variance 0.08186
Sum 6000.9048
Skewness 0.8839
Kurtosis -0.3791
Coefficient of Variation 0.9819
  • Acousticness is not normally distributed (p-value 9.873340556349851e-20)

Instrumentalness

numerical

Approximate Distinct Count 4005
Approximate Unique (%) 19.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.05616
Minimum 0
Maximum 1
Zeros 9319
Zeros (%) 45.2%
Negatives 0
Negatives (%) 0.0%
  • Instrumentalness is skewed right (γ1 = 3.7125)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 2.42e-06
Q3 0.00047375
95-th Percentile 0.582
Maximum 1
Range 1
IQR 0.00047375

Descriptive Statistics

Mean 0.05616
Standard Deviation 0.1936
Variance 0.03749
Sum 1156.5974
Skewness 3.7125
Kurtosis 12.6026
Coefficient of Variation 3.4476
  • Instrumentalness is not normally distributed (p-value 4.331628349226016e-25)
  • Instrumentalness has 4413 outliers

Liveness

numerical

Approximate Distinct Count 1537
Approximate Unique (%) 7.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.1937
Minimum 0
Maximum 1
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Liveness is skewed right (γ1 = 2.3077)

Quantile Statistics

Minimum 0
5-th Percentile 0.0583
Q1 0.0941
Median 0.125
Q3 0.237
95-th Percentile 0.574
Maximum 1
Range 1
IQR 0.1429

Descriptive Statistics

Mean 0.1937
Standard Deviation 0.1688
Variance 0.0285
Sum 3988.0956
Skewness 2.3077
Kurtosis 5.8287
Coefficient of Variation 0.8718
  • Liveness is not normally distributed (p-value 1.2471774903572915e-11)
  • Liveness has 1500 outliers

Valence

numerical

Approximate Distinct Count 1293
Approximate Unique (%) 6.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 0.5301
Minimum 0
Maximum 0.993
Zeros 28
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Valence is skewed left (γ1 = -0.1029)

Quantile Statistics

Minimum 0
5-th Percentile 0.119
Q1 0.34
Median 0.538
Q3 0.727
95-th Percentile 0.921
Maximum 0.993
Range 0.993
IQR 0.387

Descriptive Statistics

Mean 0.5301
Standard Deviation 0.2455
Variance 0.06029
Sum 10916.4046
Skewness -0.1029
Kurtosis -0.9281
Coefficient of Variation 0.4632

Tempo

numerical

Approximate Distinct Count 14954
Approximate Unique (%) 72.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 120.5626
Minimum 0
Maximum 243.372
Zeros 19
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Tempo is skewed right (γ1 = 0.3889)

Quantile Statistics

Minimum 0
5-th Percentile 78.3993
Q1 96.994
Median 119.959
Q3 139.9235
95-th Percentile 174.7222
Maximum 243.372
Range 243.372
IQR 42.9295

Descriptive Statistics

Mean 120.5626
Standard Deviation 29.5881
Variance 875.4553
Sum 2.4829e+06
Skewness 0.3889
Kurtosis -0.1063
Coefficient of Variation 0.2454
  • Tempo has 65 outliers
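
The report does not say how outliers are counted. A common convention is Tukey's rule, which marks values more than 1.5 times the IQR below Q1 or above Q3. The sketch below applies that rule to the Tempo quartiles as an assumption. It shows where the fences would fall, and it does not claim to reproduce the count of 65.

```python
# Quartiles and extremes copied from the Tempo section.
q1, q3 = 96.994, 139.9235
minimum, maximum = 0.0, 243.372

iqr = q3 - q1
# Tukey fences (assumed rule, not confirmed by the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

zero_tempo_is_outlier = minimum < lower_fence
max_tempo_is_outlier = maximum > upper_fence
```

Under this assumption the fences sit near 32.6 and 204.3, so the rows with a tempo of 0 and the fastest tracks would both fall outside them.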

Duration_min

numerical

Approximate Distinct Count 14609
Approximate Unique (%) 70.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 3.7424
Minimum 0
Maximum 77.9343
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Duration_min is skewed right (γ1 = 23.3346)

Quantile Statistics

Minimum 0
5-th Percentile 2.2271
Q1 2.9967
Median 3.5513
Q3 4.2022
95-th Percentile 5.599
Maximum 77.9343
Range 77.9343
IQR 1.2054

Descriptive Statistics

Mean 3.7424
Standard Deviation 2.0852
Variance 4.3481
Sum 77071.779
Skewness 23.3346
Kurtosis 784.6852
Coefficient of Variation 0.5572
  • Duration_min is not normally distributed (p-value 2.444079695686895e-20)
  • Duration_min has 801 outliers

Title

categorical

Approximate Distinct Count 18023
Approximate Unique (%) 87.5%
Missing 0
Missing (%) 0.0%
Memory Size 2615847
  • The largest value (0) is over 26.06 times larger than the second largest value (Color Esperanza 2020 - Various Artists (Official Video))

Length

Mean 48.9538
Standard Deviation 21.0608
Median 47
Minimum 1
Maximum 188

Sample

1st row Gorillaz - Feel Go...
2nd row Gorillaz - Rhinest...
3rd row Gorillaz - New Gol...
4th row Gorillaz - On Mela...
5th row Gorillaz - Clint E...

Letter

Count 762400
Lowercase Letter 583173
Space Separator 160158
Uppercase Letter 179227
Dash Punctuation 18263
Decimal Number 8460

Channel

categorical

Approximate Distinct Count 6673
Approximate Unique (%) 32.4%
Missing 0
Missing (%) 0.0%
Memory Size 1633801
  • The largest value (0) is over 1.97 times larger than the second largest value (T-Series)

Length

Mean 12.9505
Standard Deviation 5.4999
Median 13
Minimum 1
Maximum 60

Sample

1st row Gorillaz
2nd row Gorillaz
3rd row Gorillaz
4th row Gorillaz
5th row Gorillaz

Letter

Count 245722
Lowercase Letter 168568
Space Separator 14628
Uppercase Letter 77154
Dash Punctuation 1545
Decimal Number 2712

Views

numerical

Approximate Distinct Count 19123
Approximate Unique (%) 92.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 9.2037e+07
Minimum 0
Maximum 8079649362
Zeros 470
Zeros (%) 2.3%
Negatives 0
Negatives (%) 0.0%
  • Views is skewed right (γ1 = 9.3073)

Quantile Statistics

Minimum 0
5-th Percentile 15116.75
Q1 1.4783e+06
Median 1.3313e+07
Q3 6.7397e+07
95-th Percentile 4.2747e+08
Maximum 8079649362
Range 8079649362
IQR 6.5919e+07

Descriptive Statistics

Mean 9.2037e+07
Standard Deviation 2.726e+08
Variance 7.4312e+16
Sum 1.8954e+12
Skewness 9.3073
Kurtosis 150.9859
Coefficient of Variation 2.9619
  • Views is not normally distributed (p-value 6.106110228147731e-25)
  • Views has 2696 outliers

Likes

numerical

Approximate Distinct Count 17830
Approximate Unique (%) 86.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 647990.1538
Minimum 0
Maximum 50788652
Zeros 557
Zeros (%) 2.7%
Negatives 0
Negatives (%) 0.0%
  • Likes is skewed right (γ1 = 8.7463)

Quantile Statistics

Minimum 0
5-th Percentile 183
Q1 17542
Median 115315.5
Q3 500019.75
95-th Percentile 2.9768e+06
Maximum 50788652
Range 50788652
IQR 482477.75

Descriptive Statistics

Mean 647990.1538
Standard Deviation 1.7736e+06
Variance 3.1458e+12
Sum 1.3345e+10
Skewness 8.7463
Kurtosis 137.7962
Coefficient of Variation 2.7372
  • Likes is not normally distributed (p-value 6.635471577923714e-25)
  • Likes has 2625 outliers

Comments

numerical

Approximate Distinct Count 10429
Approximate Unique (%) 50.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 26846.7897
Minimum 0
Maximum 16083138
Zeros 1065
Zeros (%) 5.2%
Negatives 0
Negatives (%) 0.0%
  • Comments is skewed right (γ1 = 44.1567)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 406
Median 3006
Q3 13736.75
95-th Percentile 99313.65
Maximum 16083138
Range 16083138
IQR 13330.75

Descriptive Statistics

Mean 26846.7897
Standard Deviation 191175.1051
Variance 3.6548e+10
Sum 5.5288e+08
Skewness 44.1567
Kurtosis 2923.1781
Coefficient of Variation 7.121
  • Comments is not normally distributed (p-value 4.240914694139906e-25)
  • Comments has 2677 outliers

Licensed

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1425644
  • The largest value (TRUE) is over 2.32 times larger than the second largest value (FALSE)

Length

Mean 4.2262
Standard Deviation 0.6696
Median 4
Minimum 1
Maximum 5

Sample

1st row TRUE
2nd row TRUE
3rd row TRUE
4th row TRUE
5th row TRUE

Letter

Count 86565
Lowercase Letter 0
Space Separator 0
Uppercase Letter 86565
Dash Punctuation 0
Decimal Number 469
  • The top 2 categories (TRUE, FALSE) take over 50.0%
  • The largest value (true) is over 2.32 times larger than the second largest value (false)
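
The character statistics above are enough to recover approximate category counts for Licensed. The sketch below assumes the three distinct values are the strings TRUE (4 letters), FALSE (5 letters), and a single-character 0 placeholder. That is an inference from the minimum length of 1 and the 469 decimal digits, not something the report states.

```python
# Values copied from the Licensed section and the overview.
rows = 20594
letters = 86565  # "Count" under Letter: total alphabetic characters
digits = 469     # "Decimal Number": assumed to be one digit per "0" entry

placeholder_count = digits
true_plus_false = rows - placeholder_count

# Solve 4 * true + 5 * false = letters with true + false = true_plus_false.
false_count = letters - 4 * true_plus_false
true_count = true_plus_false - false_count

mean_length = (letters + digits) / rows
ratio = true_count / false_count
```

Under these assumptions the recovered counts agree with the reported mean length of 4.2262 and with a TRUE to FALSE ratio of about 2.32.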

official_video

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1424069
  • The largest value (TRUE) is over 3.48 times larger than the second largest value (FALSE)

Length

Mean 4.1497
Standard Deviation 0.6329
Median 4
Minimum 1
Maximum 5

Sample

1st row TRUE
2nd row TRUE
3rd row TRUE
4th row TRUE
5th row TRUE

Letter

Count 84990
Lowercase Letter 0
Space Separator 0
Uppercase Letter 84990
Dash Punctuation 0
Decimal Number 469
  • The top 2 categories (TRUE, FALSE) take over 50.0%
  • The largest value (true) is over 3.48 times larger than the second largest value (false)

Stream

numerical

Approximate Distinct Count 18339
Approximate Unique (%) 89.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329504
Mean 1.3264e+08
Minimum 0
Maximum 3386520288
Zeros 576
Zeros (%) 2.8%
Negatives 0
Negatives (%) 0.0%
  • Stream is skewed right (γ1 = 4.1473)

Quantile Statistics

Minimum 0
5-th Percentile 1.1617e+06
Q1 1.5591e+07
Median 4.7305e+07
Q3 1.3435e+08
95-th Percentile 5.6962e+08
Maximum 3386520288
Range 3386520288
IQR 1.1875e+08

Descriptive Statistics

Mean 1.3264e+08
Standard Deviation 2.4236e+08
Variance 5.8738e+16
Sum 2.7317e+12
Skewness 4.1473
Kurtosis 23.561
Coefficient of Variation 1.8271
  • Stream is not normally distributed (p-value 3.2615844381779644e-23)
  • Stream has 2278 outliers

EnergyLiveness

numerical

Approximate Distinct Count 17433
Approximate Unique (%) 84.7%
Missing 2
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 329472
Mean 5.1672
Minimum 4.87e-05
Maximum 59.1139
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • EnergyLiveness is skewed right (γ1 = 2.692)

Quantile Statistics

Minimum 4.87e-05
5-th Percentile 0.8993
Q1 2.3862
Median 4.2569
Q3 6.822
95-th Percentile 12.5126
Maximum 59.1139
Range 59.1139
IQR 4.4358

Descriptive Statistics

Mean 5.1672
Standard Deviation 4.1174
Variance 16.9532
Sum 106403.5348
Skewness 2.692
Kurtosis 16.0036
Coefficient of Variation 0.7968
  • EnergyLiveness is not normally distributed (p-value 7.69173838128051e-09)
  • EnergyLiveness has 798 outliers
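
The report does not say how EnergyLiveness was derived. One hypothesis, based only on the column name, is that it is Energy divided by Liveness. If so, a row where both inputs are zero yields NaN, which is counted as missing. That would be consistent with the 2 missing values here, the 2 zeros reported for each of Energy and Liveness, and the 0 infinite values. The sketch below shows this behavior in pandas on made-up rows. The formula is an assumption.

```python
import numpy as np
import pandas as pd

# Made-up rows: a normal case, 0/0, and a positive value divided by 0.
toy = pd.DataFrame(
    {
        "Energy": [0.5, 0.0, 0.5],
        "Liveness": [0.25, 0.0, 0.0],
    }
)

# Hypothetical derivation of the column (not confirmed by the report).
ratio = toy["Energy"] / toy["Liveness"]

missing = int(ratio.isna().sum())       # 0/0 becomes NaN
infinite = int(np.isinf(ratio).sum())   # x/0 with x > 0 becomes inf
```

Because the report lists no infinite values, this hypothesis would also imply that no row has a zero Liveness with a nonzero Energy.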

most_playedon

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1482768
  • The largest value (Spotify) is over 3.2 times larger than the second largest value (Youtube)

Length

Mean 7
Standard Deviation 0
Median 7
Minimum 7
Maximum 7

Sample

1st row Spotify
2nd row Spotify
3rd row Spotify
4th row Spotify
5th row Youtube

Letter

Count 144158
Lowercase Letter 123564
Space Separator 0
Uppercase Letter 20594
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Spotify, Youtube) take over 50.0%
  • The largest value (spotify) is over 3.2 times larger than the second largest value (youtube)
  • most_playedon has words of constant length

Interactions

Correlations

Missing Values